Methods for the Inclusion of Real-World Evidence in a Rare Events Meta-Analysis of Randomized Controlled Trials

Background: Many rare events meta-analyses of randomized controlled trials (RCTs) have lower statistical power, and real-world evidence (RWE) is becoming widely recognized as a valuable source of evidence. The purpose of this study is to investigate methods for including RWE in a rare events meta-analysis of RCTs and the impact on the level of uncertainty around the estimates. Methods: Four methods for the inclusion of RWE in evidence synthesis were investigated by applying them to two previously published rare events meta-analyses: the naïve data synthesis (NDS), the design-adjusted synthesis (DAS), the use of RWE as prior information (RPI), and the three-level hierarchical models (THMs). We gauged the effect of the inclusion of RWE by varying the degree of confidence placed in RWE. Results: This study showed that the inclusion of RWE in a rare events meta-analysis of RCTs could increase the precision of the estimates, but this depended on the method of inclusion and the level of confidence placed in RWE. NDS cannot consider the bias of RWE, and its results may be misleading. DAS resulted in stable estimates for the two examples, regardless of whether we placed high- or low-level confidence in RWE. The results of the RPI approach were sensitive to the confidence level placed in RWE. The THM was effective in allowing for accommodating differences between study types, while it had a conservative result compared with other methods. Conclusion: The inclusion of RWE in a rare events meta-analysis of RCTs could increase the level of certainty of the estimates and enhance the decision-making process. DAS might be appropriate for inclusion of RWE in a rare event meta-analysis of RCTs, but further evaluation in different scenarios of empirical or simulation studies is still warranted.


Introduction
Well-conducted randomized controlled trials (RCTs) have been considered the gold standard, and as such are often used in meta-analyses to evaluate the effects of healthcare interventions [1]. For rare events meta-analyses of RCTs, the outcomes of the included trials may be very sparse, with no events even being observed in some trials [2,3], resulting in lower statistical power [4]. Recently, there has been a growing interest in the use of real-world evidence (RWE) in clinical drug assessments and health care evaluations [5][6][7]. RWE is clinical evidence from multiple sources outside the typical clinical research setting, including electronic health records and billing databases [5]. Regulatory bodies such as the US Food and Drug Administration (FDA) have been increasingly using RWE to monitor adverse post-market safety events and make regulatory decisions [5,8]. An important potential advantage of RWE is that it can represent real-life clinical practice across a broader and more diverse distribution of patients than RCTs [9]. The results from RWE studies, such as those obtained from epidemiology databases, health surveys, and electronic medical records, typically include larger sample sizes and have longer follow-up periods, increasing the probability of finding rare events [10][11][12].
RCTs and RWE studies are potentially valuable evidence sources for assessing the effect of rare events, and integrating RWE studies and RCTs to assess the effects of rare events may help increase the overall level of certainty of evidence [13,14]. However, the inclusion of RWE in a rare events meta-analysis of RCTs is not a straightforward issue, as the estimates obtained from RWE studies may be subject to bias due to the potential selection and information biases of real-world data [15]. Therefore, RCTs and RWE studies should not be directly integrated without considering the bias of the RWE studies. Generalized evidence synthesis statistical approaches for integrating RCTs and RWE studies can provide not only an integrated result but also a quantitative analysis of the influence of RWE on the integrated evidence under different bias assumptions for the RWE studies, which is more attractive for decision-making [16][17][18][19]. Relevant methods have been published in the statistical literature but have not been systematically verified or widely used in empirical analyses [16].
Therefore, the aim of this study is to investigate whether these statistical methods, including naïve data synthesis (NDS), design-adjusted synthesis (DAS), the use of RWE as prior information (RPI), and three-level hierarchical models (THMs), could be used for integrating RWE in a rare events meta-analysis of RCTs by considering the potential inherent biases of RWE studies and their impact on the level of uncertainty around the estimates.

Study Design Overview
In this study, we used four approaches to combine RCTs and RWE studies [18,20]: (1) NDS, where data from all studies, regardless of the study design, were directly combined; (2) DAS, where RCTs and RWE studies were synthesized and where the information from the RWE studies was adjusted to reflect the confidence in the study findings; (3) the RPI, which included RWE via the prior for the treatment effect and combined this with a likelihood based only on the data from RCTs; (4) the THM, which was used to simultaneously model the between-study heterogeneity of treatment effects within each study design (RCT or RWE studies) and across study designs. The methodologies were applied to two recently published meta-analyses concerning the risk of diabetic ketoacidosis (DKA) among patients using sodium/glucose cotransporter 2 (SGLT-2) inhibitors compared with active comparators [21] and the effectiveness of 23-valent pneumococcal polysaccharide vaccine (PPV23) vaccination against invasive pneumococcal disease (IPD) in elderly patients [22]. Here, we describe how these methods can be performed and how the bias of the RWE studies can be considered. Finally, we discuss the pros and cons of each method.

Illustrative Example Dataset
The first example was a recent meta-analysis conducted by Falkenhorst et al. [22]. In this meta-analysis, the authors used the evidence from RCTs and RWE studies to investigate the effectiveness of PPV23 vaccination against IPD and pneumococcal pneumonia in adults aged ≥ 60 years. Here, we focus on IPD. This study included four RCTs, five cohort studies, three case-control studies, and five case-case studies. We excluded one RCT (for reporting no events in both treatment groups) and five case-case studies (for IPD not being reported) in our study. The outcomes of these studies are displayed in Table 1. The authors assessed the risk of bias of individual studies using the Cochrane Risk of Bias tool [23] for RCTs and the Newcastle-Ottawa Scale for observational studies [24]. A total of 16 IPD events were reported across three RCTs, which randomized a total of 41,992 patients, and the incidence rate was 0.04%. We observed 165 IPD events among 575,454 patients in five cohort studies. The second example showed how to integrate RCTs and RWE studies when assessing the effects of rare adverse events. A typical example of rare adverse events is the risk of DKA caused by SGLT-2 inhibitors. The data used in this study were obtained from the study conducted by Alkabbani et al. [21]. This study included twelve placebo-controlled RCTs, seven active comparator RCTs, and seven active comparator RWE studies. All were retrospective propensity score-matched cohort studies. Our primary concern was whether SGLT-2 inhibitors increase the risk of DKA compared with active comparators. We excluded one RWE study because its control was not an active comparator [36]. In this study, the authors assessed the risk of bias in each study using the checklist proposed by Downs et al. [37] for RCTs and RWE studies. Table 2 shows the basic characteristics of the included RCTs and RWE studies. A total of 8 DKA events were reported across all RCTs, which randomized 8100 patients, and the incidence rate was 0.1%. We observed 2016 DKA events among 1,072,992 patients across all RWE studies. Table 2. Characteristics of studies included in the meta-analysis of the risk of diabetic ketoacidosis (DKA) among patients using sodium/glucose cotransporter 2 (SGLT-2) inhibitors compared with active comparators.

Methods for Incorporating RWE in a Rare Events Meta-Analysis of RCTs
In the following, we describe how these methods can be performed and how the bias of the RWE studies can be considered. We used j = 1, 2, . . . k to denote study j. The odds ratio (OR) metric was used as the effect measure because rare events are often binary outcomes. These methods are also suitable for other measures, such as the relative effect (RR) and risk difference (RD).

The Naïve Data Synthesis
The NDS is identical to the conventional random effects meta-analysis model, namely the normal-normal hierarchical model. The model was then written as follows: where d j and s 2 j are the study-specific empirical effect and the within-study variances for study j, respectively. Both the treatment effect (d i ) and variance (s 2 i ) are on the log OR scale. Here, θ j denotes the true underlying log-OR for the jth study, θ is the summary log-OR, and τ 2 is the between-study variance. We assigned a weakly informative prior for θ, i.e., θ ∼ N 0, 2.82 2 , following Günhan et al. [49]. For the heterogeneity parameter τ, we used a half-normal prior with a scale of 0.5 ( τ ∼ HN(0.5)), following Friede et al. [50].

The Design-Adjusted Synthesis
The DAS approach was originally used for integrating RCTs and RWE studies in network meta-analysis. We simplified this model to fit a pairwise meta-analysis. This model gives less weight to RWE studies compared to RCTs. This down-weighting is achieved by inflating the variance of the mean effect of RWE studies. The general conventional meta-analysis model was extended by including a down weighting factor ω j in the variance of the estimated treatment effect of study j: where ω j is the variance inflation factor for study j, and the other parameters are the same as the DAS; ω j was equal to 1 when the study type was an RCT, since the estimates obtained from RCTs are often considered unbiased. For the RWE studies, the ω j values ranged from zero to one, and the variance of the treatment effect was inflated so that the weight in the meta-analysis was decreased. We evaluated varying levels of trust in the RWE studies, with ω j~b eta (4, 1), ω j~b eta (1.5, 1), and ω j~b eta (0.25, 1) standing for high, medium, and lower levels of confidence in RWE studies. The prior specifications for θ and τ were identical to those of the NDS.

The Use of Real-World Evidence as Prior Information
This approach included RWE via the prior for the treatment effect and combined this with a likelihood based only on the data from RCTs. We first synthesized the RWE studies to generateθ RWE with a corresponding varianceV RWE . Then, we centered the informative prior for the population mean (θ) onθ RWE but use an inflated variance of the mean effect where ω is the variance inflation factor. The prior specification for τ is identical to the NDS. We also evaluated varying levels of trust in the RWE studies, with ω j~b eta (4, 1), ω j~b eta (1.5, 1), and ω j~b eta (0.25, 1).

The Three-Level Hierarchical Model
For the THM, the evidence from the RCTs and RWE studies were modeled separately at the within-study and within-design levels, and then the estimates for RCTs and RWE studies were combined in an overall measure of the treatment effect using the random-effects model, assuming i = 1, 2, where 1 represents the RCTs and 2 represents the RWE studies. The model is presented as follows: where d j,i and s 2 j are the study-specific empirical effect and the within-study variances for study j and design i, respectively; θ j,i denotes the true underlying effect for the jth study and ith design; θ i and τ 2 i are the summary effect and the variance for the ith design, respectively; and the other parameters are the same as the NDS. The RWE studies could be down-weighted by inflating the variance; that is, θ 2 ∼ N(θ, τ 2 /ω). The prior specifications for θ and τ were identical to those of the NDS. For the study-design variance τ 2 i , we also assigned a half-normal prior with a scale of 0.5. The prior specifications for ω were also set in three different scenarios, namely ω j~b eta (4, 1), ω j~b eta (1.5, 1), and ω j~b eta (0.25, 1).

Implementation and Model Fit
For RCTs, the effect of the single-zero event trial was estimated by the method proposed by Greenland et al. [51]. All models were performed using Hamiltonian Monte Carlo algorithms, as implemented in the RStan package (version 2.21.2) in the R statistical environment (version 4.0.3, R Foundation for Statistical Computing, Vienna, Austria). As a comparison, we estimated the results of RCTs using a Bayesian random effects metaanalysis. We fitted four chains for each model, each with 5000 iterations, the first half of which were considered warmups and discarded. Convergence was judged to have occurred whenR (the potential scale reduction factor) was no greater than 1.1 for all parameters [52].

Effectiveness of PPV23 Vaccination against IPD in Elderly Patients
The Bayesian meta-analysis including only RCTs had a posterior median of 0.29 and a 95% credibility interval (CrI) [0.06, 1.45], which indicated that based on the RCT evidence, the PPV23 is not effective against IPD. The posterior median and 95% CrI based on the NDS were 0.42 and (0.29, 0.58), which indicated the effectiveness of PPV23 against IPD in elderly patients. Table 3 shows the posterior median and 95% CrI of the effectiveness of PPV23 vaccination against IPD in elderly patients using the DAS, RPI, and THM. The models, except for the THM, consistently showed the effectiveness of PPV23 vaccination against IPD in elderly patients. By comparing the length of the intervals between the NDS and the other three models, we found that the length of the interval of the NDS was much smaller than that of the other three models. We also observed that the length of the interval in the estimates obtained from the four models was much smaller than the estimates from only RCTs.

Risk of DKA among Users Receiving SGLT-2 Inhibitors Versus Active Comparators
We did not observe an increased risk of DKA when the results were pooled from the RCTs (OR = 1.02, 95% CrI: 0.37-2.86), while we observed an increased risk of DKA when directly combining the RCTs and RWE studies using NDS (OR = 1.58, 95% CrI: 1.22-2.04). The estimates of each model under different variance inflation factors, which represent different levels of confidence in RWE studies, are presented in Table 4. The results of the different models were inconsistent. The DAS showed that SGLT2 inhibitors were associated with an increased risk of DKA under different confidence levels placed in RWE studies. For the RPI, we observed an increased risk when we placed a high level of confidence in RWE studies (OR = 1.56, 95% CrI: 1.12-2.12); however, we did not observe an increased risk when we placed a medium level of confidence (OR = 1.52, 95% CrI: 0.98-2.19) or a low level (OR = 1.41, 95% CrI: 0.56-2.45) of confidence in RWE studies. We also did not observe an increased risk of DKA by using the THM, even though we placed a high level of confidence in RWE studies (OR = 1.42, 95% CrI: 0.50-3.75). As with the first example, the interval lengths of the four model estimates were smaller than the outcomes estimated based on RCTs alone.

Discussion
In this study, we discussed approaches for incorporating RWE in a rare events metaanalysis of RCTs, with particular interest in the effectiveness of PPV23 against IPD in elderly patients and the risk of DKA among patients using SGLT-2 inhibitors compared with active comparators. We did not observe the effectiveness of PPV23 and the risk of DKA among patients using SGLT-2 inhibitors when the data were only from RCTs. In the effectiveness example of PPV 23, all methods, except for the THM, found that the inclusion of RWE studies in rare events meta-analysis of RCTs could complement the evidence of the findings of the RCT-only analysis. However, for the risk of DKA among patients using SGLT-2 inhibitors, the results of the different models were inconsistent. For instance, for the RPI, an obvious finding was that we did not observe an increased risk when low and medium confidence was placed in the RWE studies. The results showed that the inclusion of RWE studies during the evidence synthesis process could increase the certainty of the estimates when the rare events meta-analysis of RCTs could provide enough evidence, but this depended on the method of inclusion and the confidence level placed in the RWE studies.
We do not recommend using NDS as the main method of analysis to combine rare events data. A recent scoping review found that the NDS approach was the most frequently used in empirical analyses [16]. However, NDS ignores differences in study designs and cannot consider the potential bias of RWE studies [20]. Furthermore, compared with RCTs, the results of RWE studies often show a large effect because of some uncontrolled confounding bias factors [53], and their interval estimates are much smaller because the events and the sample size are usually much larger [21]. Therefore, with the inclusion of RWE studies in a rare events meta-analysis of RCTs using NDS, not only can the bias of RWE studies not be adjusted but it would also give a larger weight than that of RCTs. Our two illustrative examples were also confirmed. This indicated that addressing the bias of RWE studies leads to more informed decision-making when integrating RCTs and RWE to assess the effects of rare events.
For the DAS, the data from the RWE studies are down-weighted based on the confidence levels of their credibility. An obvious feature of this model is that it needs to separate assessments of bias for each RWE study, which could be done by eliciting expert opinions regarding the bias parameters [54]. Another useful method is to gauge the impact on the model estimates by varying the amount of confidence placed in RWE studies [20]. For the RPI, there are two key differences with the DAS: one is in the estimation of heterogeneity that is performed separately for RCTs and RWE studies, and the other is that it only needs to set the magnitude of possible bias for the total RWE. Therefore, this model may be more attractive in empirical analyses, because one can not only set the prior distribution of heterogeneity parameters according to the characteristics of RCT and RWE studies (such as patient follow-up and statistical analysis methods), but can also set the magnitude of bias based on the risk of bias evaluation results for the whole RWE rather than for each RWE study.
Although the advantage of the THM is that it can obtain the estimates of each study design to increase its precision, the results of the total combined effect are conservative. In these two examples, compared to the results of the other models, the most notable difference was that the interval length was much larger when using the THM, even though we placed a high level of confidence in the RWE studies. This was because the THM explicitly considers the heterogeneity between study designs, allowing for additional variability across studies [20]. The main goal of the inclusion of RWE studies in a rare events meta-analysis of RCTs is to increase the power in testing whether the true effect exists. Rare events are often associated with safety results such as serious adverse events, indicating that the THM as the main method of analysis may not be suitable and may increase the chances of patient exposure to unnecessary danger [55]. The estimates can be improved by using expert opinions and meta-epidemiology to set a WIP for the combined effects [56]. For instance, constructing a WIP works via the consideration of the prior expected range of the treatment effect [57] and the heterogeneity values [58].
There were some limitations of this study that need to be recognized. First, this study only utilized two illustrative examples, the results may differ in other outcomes, and broader analyses in further empirical analyses are necessary. Second, this study only investigated methods for the inclusion of RWE in rare event meta-analyses of RCTs, and the performance of these methods was not explored. There is no clear rule for choosing the method to combine RCTs and RWE studies; when different methods lead to contradicting conclusions, researchers can choose those methods that lead to the desired outcome. Thus, a further evaluation of these methods in different scenarios, including the use of comprehensive simulation studies, is warranted. Third, meta-regression was not considered in this study. Although meta-regression can explain some between-study heterogeneity, it may be limited by the information on available covariates of each study or the number of studies in a meta-analysis. Fourth, although we used the OR as the effect measure, these methods are generalizable to other measures of association frequently used in meta-analyses, such as the RR, RD, and weighted mean differences. The performance of these methods in other measures needs to be further explored.

Conclusions
In summary, including RWE in a rare events meta-analysis has the potential to corroborate findings from RCTs, increase precision, and enhance the decision-making process, but this depends on the method of inclusion and the assumption for the magnitude of bias risk of RWE. The statistical performance of these approaches requires further evaluation in different scenarios of empirical or simulation studies.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/jcm12041690/s1, supplementary file: R code for the methods included this study.

Conflicts of Interest:
The authors declare no competing interest.